We introduce a new class of functions that can be minimized in polynomial time in the value oracle model. These are functions $f$ satisfying $f(x) + f(y) \ge f(x \sqcap y) + f(x \sqcup y)$ where the domain of each variable $x_i$ corresponds to nodes of a rooted binary tree, and operations $\sqcap, \sqcup$ are defined with respect to this tree. Special cases include previously studied $L^\natural$-convex and bisubmodular functions, which can be obtained with particular choices of trees. We present a polynomial-time algorithm for minimizing functions in the new class. It combines Murota's steepest descent algorithm for $L^\natural$-convex functions with bisubmodular minimization algorithms.

1 Introduction

Let $f : \mathcal{D} \to \mathbb{R}$ be a function of $n$ variables $x = (x_1, \ldots, x_n)$ where $x_i \in D_i$; thus $\mathcal{D} = D_1 \times \ldots \times D_n$. We call elements of $D_i$ labels, and the argument of $f$ a labeling. Denote $V = \{1, \ldots, n\}$ to be the set of nodes. We will consider functions $f$ satisfying

$$f(x) + f(y) \ge f(x \sqcap y) + f(x \sqcup y) \qquad \forall x, y \in \mathcal{D} \tag{1}$$

where binary operations $\sqcap, \sqcup : \mathcal{D} \times \mathcal{D} \to \mathcal{D}$ (expressed component-wise via operations $\sqcap, \sqcup : D_i \times D_i \to D_i$) are defined below.
There are several known cases in which function $f$ can be minimized in polynomial time in the value oracle model. The following two cases will be of particular relevance:

• $L^\natural$-convex functions¹: $D_i = \{0, 1, \ldots, K_i\}$ where $K_i \ge 0$ is an integer, $a \sqcap b = \lfloor \frac{a+b}{2} \rfloor$, $a \sqcup b = \lceil \frac{a+b}{2} \rceil$. Property (1) is then called discrete midpoint convexity.

• Bisubmodular functions: $D_i = \{-1, 0, +1\}$, $a \sqcup b = \mathrm{sign}(a + b)$, $a \sqcap b = |ab|\,\mathrm{sign}(a + b)$.

In this paper we introduce a new class of functions which includes the two classes above as special cases. We assume that labels in each set $D_i$ are nodes of a tree $T_i$ with a designated root $r_i \in D_i$. Define a partial order $\preceq$ on $D_i$ as follows: $a \preceq b$ if $a$ is an ancestor of $b$, i.e. $a$ lies on the path from $b$ to $r_i$ ($a, b \in D_i$). For two labels $a, b \in D_i$ let $\mathcal{P}[a \to b]$ be the unique path from $a$ to $b$ in $T_i$, $\rho(a, b)$ be the number of edges in this path, and $\mathcal{P}[a \to b, d]$ for integer $d \ge 0$ be the $d$-th node of this path so that $\mathcal{P}[a \to b, 0] = a$ and $\mathcal{P}[a \to b, \rho(a, b)] = b$. If $d > \rho(a, b)$ then we set by definition $\mathcal{P}[a \to b, d] = b$.



¹ Pronounced as "L-natural convex".

Figure 1: Examples of trees. Roots are always at the bottom. (a) Illustration of the definition of $a \sqcap b$, $a \sqcup b$, $a \wedge b$ and $a \vee b$. (b) A tree for $L^\natural$-convex functions. (c) A tree for bisubmodular functions. (d) A tree for which a weakly tree-submodular function can be minimized efficiently (see section 4).

With this notation, we can now define $a \sqcap b$, $a \sqcup b$ as the unique pair of labels satisfying the following two conditions: (1) $\{a \sqcap b, a \sqcup b\} = \{\mathcal{P}[a \to b, \lfloor d/2 \rfloor], \mathcal{P}[a \to b, \lceil d/2 \rceil]\}$ where $d = \rho(a, b)$, and (2) $a \sqcap b \preceq a \sqcup b$ (Figure 1(a)). We call functions $f$ satisfying condition (1) with such choice of $(\mathcal{D}, \sqcap, \sqcup)$ strongly tree-submodular. Clearly, if each $T_i$ is a chain with nodes $0, 1, \ldots, K$ and 0 being the root (Figure 1(b)) then strong tree-submodularity is equivalent to $L^\natural$-convexity. Furthermore, if each $T_i$ is the tree shown in Figure 1(c) then strong tree-submodularity is equivalent to bisubmodularity.
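The definition of $\sqcap, \sqcup$ can be illustrated with a minimal Python sketch. It assumes each tree is stored as a child-to-parent dictionary with the root mapped to `None`; this representation and the helper names `path`, `depth` and `sqcap_sqcup` are illustrative choices, not notation from the paper.

```python
def path(parent, a, b):
    """Unique path from a to b in a rooted tree given as a child -> parent map."""
    def ancestors(v):
        chain = [v]
        while parent[v] is not None:
            v = parent[v]
            chain.append(v)
        return chain

    up_a, up_b = ancestors(a), ancestors(b)
    in_a = set(up_a)
    # First node on the way from b to the root that is also an ancestor of a.
    top = next(v for v in up_b if v in in_a)
    left = up_a[:up_a.index(top) + 1]        # a, ..., top
    right = up_b[:up_b.index(top)][::-1]     # child of top, ..., b
    return left + right


def depth(parent, v):
    """Number of edges between v and the root."""
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d


def sqcap_sqcup(parent, a, b):
    """Return (a ⊓ b, a ⊔ b): the two middle nodes of the path, ancestor first."""
    p = path(parent, a, b)
    d = len(p) - 1
    lo, hi = p[d // 2], p[(d + 1) // 2]
    # The two middle nodes are equal or adjacent, so the shallower one is the ancestor.
    if depth(parent, lo) > depth(parent, hi):
        lo, hi = hi, lo
    return lo, hi
```

On a chain rooted at 0 this returns the floor and ceiling of the midpoint, and on the three-node tree with root 0 and children -1, +1 it returns the bisubmodular pair, matching the two special cases listed in the introduction.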

The main result of this paper is the following

Theorem 1. If each tree $T_i$ is binary, i.e. each node has at most two children, then a strongly tree-submodular function $f$ can be minimized in time polynomial in $n$ and $\max_i |D_i|$.

Weak tree-submodularity  We will also study alternative operations on trees, which we denote as $\wedge$ and $\vee$. For labels $a, b \in D_i$ we define $a \wedge b$ as their highest common ancestor, i.e. the unique node on the path $\mathcal{P}[a \to b]$ which is an ancestor of both $a$ and $b$. The label $a \vee b$ is defined as the unique label on the path $\mathcal{P}[a \to b]$ such that the distance between $a$ and $a \vee b$ is the same as the distance between $a \wedge b$ and $b$ (Figure 1(a)).
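As a rough illustration of $\wedge$ and $\vee$, the following Python sketch computes both on a tree stored as a child-to-parent dictionary (root mapped to `None`). The representation and the names `ancestors` and `meet_join` are assumptions made for this example only.

```python
def ancestors(parent, v):
    """List v, parent(v), ..., root."""
    chain = [v]
    while parent[v] is not None:
        v = parent[v]
        chain.append(v)
    return chain


def meet_join(parent, a, b):
    """Return (a ∧ b, a ∨ b) for labels a, b of one rooted tree."""
    up_a, up_b = ancestors(parent, a), ancestors(parent, b)
    in_a = set(up_a)
    # a ∧ b: the node of the path a -> b that is an ancestor of both labels.
    meet = next(v for v in up_b if v in in_a)
    left = up_a[:up_a.index(meet) + 1]        # a, ..., a ∧ b
    right = up_b[:up_b.index(meet)][::-1]     # nodes strictly below a ∧ b, ending at b
    p = left + right                          # the path a -> b
    # a ∨ b: the node at distance rho(a ∧ b, b) = len(right) from a.
    join = p[len(right)]
    return meet, join
```

On a chain rooted at 0 the pair reduces to (min, max), and on the three-node bisubmodular tree it coincides with $(\sqcap, \sqcup)$.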

We say that function $f$ is weakly tree-submodular if it satisfies

$$f(x) + f(y) \ge f(x \wedge y) + f(x \vee y) \qquad \forall x, y \in \mathcal{D} \tag{2}$$

We will show that strong tree-submodularity (1) implies weak tree-submodularity (2), which justifies the terminology. If all trees are chains shown in Figure 1(b) ($D_i = \{0, 1, \ldots, K\}$ with 0 being the root) then $\wedge$ and $\vee$ correspond to the standard operations "meet" and "join" (min and max) on an integer lattice. It is well-known that in this case weakly tree-submodular functions can be minimized in time polynomial in $n$ and $K$ [39, 32]. In section 4 we give a slight generalization of this result; namely, we allow trees shown in Figure 1(d).

1.1 Related work 

Studying operations $(\sqcap, \sqcup)$ that give rise to tractable optimization problems has received considerable attention in the literature. Some known examples of such operations are reviewed below. For simplicity, we assume that domains $D_i$ (and operations $(\sqcap, \sqcup)$) are the same for all nodes: $D_i = D$ for some finite set $D$.

Submodular functions on lattices  The first example that we mention is the case when $D$ is a distributive lattice and $\sqcap, \sqcup$ are the meet and join operations on this lattice. Functions that satisfy (1) for this choice of $D$ and $\sqcap, \sqcup$ are called submodular functions on $D$; it is well-known that they can be minimized in strongly polynomial time [18, 37, 19].

Recently, researchers considered submodular functions on non-distributive lattices. It is known that a lattice is non-distributive if it contains as a sublattice either the pentagon $\mathcal{N}_5$ or the diamond $\mathcal{M}_3$. Krokhin and Larose [27] proved tractability for the pentagon case, using nested applications of a submodular minimization algorithm. The case of the diamond was considered by Kuivinen [28], who proved pseudo-polynomiality of the problem. The case of general non-distributive lattices is still open.

$L^\natural$-convex functions  The concept of $L^\natural$-convexity was introduced by Fujishige and Murota [16] as a variant of $L$-convexity by Murota [30]. $L^\natural$-convexity is equivalent to the combination of submodularity and integral convexity [13] (see [32] for details).

The fastest known algorithm for minimizing $L^\natural$-convex functions is the steepest descent algorithm of Murota [31, 32, 33]. Murota proved in [33] that the algorithm's complexity is $O(n \min\{K, n \log K\} \cdot \mathrm{SFM}(n))$ where $K = \max_i |D_i|$ and $\mathrm{SFM}(n)$ is the complexity of a submodular minimization algorithm for a function with $n$ variables. The analysis of Kolmogorov and Shioura [22] improved the bound to $O(\min\{K, n \log K\} \cdot \mathrm{SFM}(n))$. In section 2 we review Murota's algorithm (or rather its version without scaling that has complexity $O(K \cdot \mathrm{SFM}(n))$).

Note, the class of $L^\natural$-convex functions is a subclass of submodular functions on a totally ordered set $D = \{0, 1, \ldots, K\}$.

Bisubmodular functions  Bisubmodular functions were introduced by Chandrasekaran and Kabadi as rank functions of (poly-)pseudomatroids [8, 21]. Independently, Bouchet [3] introduced the concept of $\Delta$-matroids which is equivalent to pseudomatroids. Bisubmodular functions and their generalizations have also been considered by Qi [35], Nakamura [34], Bouchet and Cunningham [4] and Fujishige [15].

It has been shown that some submodular minimization algorithms can be generalized to bisubmodular functions. Qi [35] showed the applicability of the ellipsoid method. Fujishige and Iwata [17] developed a weakly polynomial combinatorial algorithm for minimizing bisubmodular functions with complexity $O(n^5\,\mathrm{EO} \log M)$ where EO is the number of calls to the evaluation oracle and $M$ is an upper bound on function values. McCormick and Fujishige [29] presented a strongly polynomial version with complexity $O(n^7\,\mathrm{EO} \log n)$, as well as a $O(n^9\,\mathrm{EO} \log^2 n)$ fully combinatorial variant that does not use divisions. The algorithms in [29] can also be applied for minimizing a bisubmodular function over a signed ring family, i.e. a subset $\mathcal{R} \subseteq \mathcal{D}$ closed under $\sqcap$ and $\sqcup$.

Valued constraint satisfaction and multimorphisms  Our paper also fits into the framework of Valued Constraint Satisfaction Problems (VCSPs) [11]. In this framework we are given a language $\Gamma$, i.e. a set of cost functions $f : D^m \to \mathbb{R}_+ \cup \{+\infty\}$ where $D$ is a fixed discrete domain and $f$ is a function of arity $m$ (different functions $f \in \Gamma$ may have different arities). A $\Gamma$-instance is any function $f : D^n \to \mathbb{R}_+ \cup \{+\infty\}$ that can be expressed as a finite sum of functions from $\Gamma$:

$$f(x_1, \ldots, x_n) = \sum_{t \in T} f_t(x_{i(t,1)}, \ldots, x_{i(t,m_t)})$$

where $T$ is a finite set of terms, $f_t \in \Gamma$ is a function of arity $m_t$, and $i(t, k)$ are indexes in $\{1, \ldots, n\}$. A finite language $\Gamma$ is called tractable if any $\Gamma$-instance can be minimized in polynomial time, and NP-hard if this minimization problem is NP-hard. These definitions are extended to infinite languages $\Gamma$ as follows: $\Gamma$ is called tractable if any finite subset $\Gamma' \subset \Gamma$ is tractable, and NP-hard if there exists a finite subset $\Gamma' \subset \Gamma$ which is NP-hard.

Classifying the complexity of different languages has been an active research area. A major open question in this line of research is the Dichotomy Conjecture of Feder and Vardi (formulated for the crisp case), which states that every constraint language is either tractable or NP-hard [14]. So far such dichotomy results have been obtained for some special cases, as described below.

Significant progress has been made in the crisp case, i.e. when $\Gamma$ only contains functions $f : D^m \to \{0, +\infty\}$. The problem is then called Constraint Satisfaction (CSP). The dichotomy is known to hold for languages with a 2-element domain (Schaefer [36]), languages with a 3-element domain (Bulatov [6]), conservative languages² (Bulatov [5]), and languages containing a single relation without sources and sinks (Barto et al. [1]). All dichotomy theorems above have the following form: if all functions in $\Gamma$ satisfy a certain condition given by one or more polymorphisms then the language is tractable, otherwise it is NP-hard.

For general VCSPs the dichotomy has been shown to hold for Boolean languages, i.e. languages with a 2-element domain (Cohen et al. [11]), conservative languages (Kolmogorov and Živný [23, 24, 25], who generalized previous results by Deineko et al. [12] and Takhanov [38]), and $\{0, 1\}$-valued languages with a 4-element domain (Jonsson et al. [20]). In these examples tractable subclasses are characterized by one or more multimorphisms, which are generalizations of polymorphisms. A multimorphism of arity $k$ over $D$ is a tuple $(\mathrm{OP}_1, \ldots, \mathrm{OP}_k)$ where $\mathrm{OP}_i$ is an operation $D^k \to D$. Language $\Gamma$ is said to admit multimorphism $(\mathrm{OP}_1, \ldots, \mathrm{OP}_k)$ if every function $f \in \Gamma$ satisfies

$$f(x_1) + \ldots + f(x_k) \ge f(\mathrm{OP}_1(x_1, \ldots, x_k)) + \ldots + f(\mathrm{OP}_k(x_1, \ldots, x_k))$$

for all labelings $x_1, \ldots, x_k$ with $f(x_1) < +\infty, \ldots, f(x_k) < +\infty$. (The pair of operations $(\sqcap, \sqcup)$ used in (1) is an example of a binary multimorphism.) The tractable classes mentioned above (for $|D| > 2$) are characterized by complementary pairs of STP and MJN multimorphisms [24] (that generalized symmetric tournament pair (STP) multimorphisms [10]), and 1-defect chain multimorphisms [20] (that generalized tractable weak-tree submodular functions in section 4 originally introduced in [26]).
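The multimorphism inequality can be checked by brute force on tiny instances. The Python sketch below is only an illustration: the function name `admits`, the convention that labelings are tuples, and the use of `float("inf")` for infinite cost are assumptions of this example, and the enumeration is exponential in the number of variables.

```python
from itertools import product


def admits(f, ops, domain, n):
    """Check whether f: domain^n -> R ∪ {inf} admits the multimorphism ops.

    ops is a tuple of k operations, each taking k labels and returning a label;
    they are applied to labelings component-wise.
    """
    k = len(ops)
    for xs in product(product(domain, repeat=n), repeat=k):
        if any(f(x) == float("inf") for x in xs):
            continue  # the inequality is only required for finite-cost labelings
        lhs = sum(f(x) for x in xs)
        # zip(*xs) yields, for every variable, the k labels assigned to it.
        rhs = sum(f(tuple(op(*col) for col in zip(*xs))) for op in ops)
        if lhs < rhs:
            return False
    return True
```

For example, with `ops = (min, max)` on the domain `range(3)`, the function `lambda x: (x[0] - x[1]) ** 2` passes the check (it is submodular on the chain), while its negation fails it.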

To make further progress on classifying complexity of VCSPs, it is important to study which multimorphisms lead to tractable optimisation problems. Operations $(\sqcap, \sqcup)$ and $(\wedge, \vee)$ introduced in this paper represent new classes of such multimorphisms: to our knowledge, previously researchers have not considered multimorphisms defined on trees.

Combining multimorphisms  Finally, we mention that some constructions, namely Cartesian products and Mal'tsev products, can be used for obtaining new tractable classes of binary multimorphisms from existing ones [27]. Note, Krokhin and Larose [27] formulated these constructions only for lattice multimorphisms $(\sqcap, \sqcup)$, but the proof in [27] actually applies to arbitrary binary multimorphisms $(\sqcap, \sqcup)$.

2 Steepest descent algorithm 

It is known that for $L^\natural$-convex functions local optimality implies global optimality [32]. We start by generalizing this result to strongly tree-submodular functions. Let us define the following "local" neighborhoods of labeling $x \in \mathcal{D}$:

$$\mathrm{NEIB}(x) = \{y \in \mathcal{D} \mid \rho(x, y) \le 1\}$$
$$\mathrm{INWARD}(x) = \{y \in \mathrm{NEIB}(x) \mid y \preceq x\}$$
$$\mathrm{OUTWARD}(x) = \{y \in \mathrm{NEIB}(x) \mid y \succeq x\}$$

where $u \preceq v$ means that $u_i \preceq v_i$ for all $i \in V$, and $\rho(x, y) = \max_{i \in V} \rho(x_i, y_i)$ is the $l_\infty$-distance between $x$ and $y$. Clearly, the restriction of $f$ to $\mathrm{INWARD}(x)$ is a submodular function, and the restriction of $f$ to $\mathrm{OUTWARD}(x)$ is bisubmodular assuming that each tree $T_i$ is binary³.

² A crisp language $\Gamma$ is called conservative if it contains all unary cost functions $f : D \to \{0, +\infty\}$ [5]. A general-valued language is called conservative if it contains all unary cost functions $f : D \to \mathbb{R}_+$ [23, 24, 25].

Proposition 2. Suppose that $f(x) = \min\{f(y) \mid y \in \mathrm{INWARD}(x)\} = \min\{f(y) \mid y \in \mathrm{OUTWARD}(x)\}$. Then $x$ is a global minimum of $f$.

Proof. First, let us prove that $f(x) = \min\{f(y) \mid y \in \mathrm{NEIB}(x)\}$. Let $x^*$ be a minimizer of $f$ in $\mathrm{NEIB}(x)$, and denote $\mathcal{D}^* = \{y \in \mathcal{D} \mid y_i \in D_i^* = \{x_i, x_i^*\}\} \subseteq \mathrm{NEIB}(x)$. We treat set $D_i^*$ as a tree with root $x_i \sqcap x_i^*$. Clearly, the restriction of $f$ to $\mathcal{D}^*$ is an $L^\natural$-convex function under the induced operations $\sqcap, \sqcup$. It is known that for $L^\natural$-convex functions optimality of $x$ in sets $\{y \in \mathcal{D}^* \mid y \preceq x\}$ and $\{y \in \mathcal{D}^* \mid y \succeq x\}$ suffices for optimality of $x$ in $\mathcal{D}^*$ [32, Theorem 7.14], therefore $f(x) \le f(x^*)$. This proves that $f(x) = \min\{f(y) \mid y \in \mathrm{NEIB}(x)\}$.

Let us now prove that $x$ is optimal in $\mathcal{D}$. Suppose not, then there exists $y \in \mathcal{D}$ with $f(y) < f(x)$. Among such labelings, let us choose $y$ with the minimum distance $\rho(x, y)$. We must have $y \notin \mathrm{NEIB}(x)$, so $\rho(x, y) \ge 2$. Clearly, $\rho(x, x \sqcup y) \le \rho(x, y) - 1$ and $\rho(x, x \sqcap y) \le \rho(x, y) - 1$. Strong tree-submodularity and the fact that $f(y) < f(x)$ imply that the cost of at least one of the labelings $x \sqcup y$, $x \sqcap y$ is smaller than $f(x)$. This contradicts the choice of $y$. □

Suppose that each tree $T_i$ is binary. The proposition shows that a greedy technique for computing a minimizer of $f$ would work. We can start with an arbitrary labeling $x \in \mathcal{D}$, and then apply iteratively the following two steps in some order:

(1) Compute minimizer $x^{\mathrm{in}} \in \arg\min\{f(y) \mid y \in \mathrm{INWARD}(x)\}$ by invoking a submodular minimization algorithm, replace $x$ with $x^{\mathrm{in}}$ if $f(x^{\mathrm{in}}) < f(x)$.

(2) Compute minimizer $x^{\mathrm{out}} \in \arg\min\{f(y) \mid y \in \mathrm{OUTWARD}(x)\}$ by invoking a bisubmodular minimization algorithm, replace $x$ with $x^{\mathrm{out}}$ if $f(x^{\mathrm{out}}) < f(x)$.

The algorithm stops if neither step can decrease the cost. Clearly, it terminates in a finite number of steps and produces an optimal solution. We will now discuss how to obtain a polynomial number of steps. We denote $K = \max_i |D_i|$.

2.1 $L^\natural$-convex case

For $L^\natural$-convex functions the steepest descent algorithm described above was first proposed by Murota [31, 32, 33], except that in step 2 a submodular minimization algorithm was used. Murota's algorithm actually computes both of $x^{\mathrm{in}}$ and $x^{\mathrm{out}}$ for the same $x$ and then chooses a better one by comparing costs $f(x^{\mathrm{in}})$ and $f(x^{\mathrm{out}})$. A slight variation was proposed by Kolmogorov and Shioura [22], who allowed an arbitrary order of steps. Kolmogorov and Shioura also established a tight bound on the number of steps of the algorithm by proving the following theorem.

³ If label $x_i$ has fewer than two children in $T_i$ then the variable's domain after restriction will be a strict subset of $\{-1, 0, +1\}$. Therefore, we may need to use a bisubmodular minimization algorithm over a signed ring family.



Theorem 3 ([22]). Suppose that each tree $T_i$ is a chain. For a labeling $x \in \mathcal{D}$ define

$$\rho^-(x) = \min\{\rho(x, y) \mid y \in \mathrm{OPT}^-[x]\}, \qquad \mathrm{OPT}^-[x] = \arg\min\{f(y) \mid y \in \mathcal{D},\ y \preceq x\} \tag{3a}$$

$$\rho^+(x) = \min\{\rho(x, y) \mid y \in \mathrm{OPT}^+[x]\}, \qquad \mathrm{OPT}^+[x] = \arg\min\{f(y) \mid y \in \mathcal{D},\ y \succeq x\} \tag{3b}$$

(a) Applying step (1) or (2) to labeling $x \in \mathcal{D}$ does not increase $\rho^-(x)$ and $\rho^+(x)$.

(b) If $\rho^-(x) \ge 1$ then applying step (1) to $x$ will decrease $\rho^-(x)$ by 1.

(c) If $\rho^+(x) \ge 1$ then applying step (2) to $x$ will decrease $\rho^+(x)$ by 1.

In the beginning of the algorithm we have $\rho^-(x) \le K$ and $\rho^+(x) \le K$, so the theorem implies that after at most $K$ calls to step (1) and $K$ calls to step (2) we get $\rho^-(x) = \rho^+(x) = 0$. The latter condition means that $f(x) = \min\{f(y) \mid y \in \mathrm{INWARD}(x)\} = \min\{f(y) \mid y \in \mathrm{OUTWARD}(x)\}$, and thus, by proposition 2, $x$ is a global minimum of $f$.

2.2 General case 

We now show that the bound $O(K)$ on the number of steps is also achievable for general strongly tree-submodular functions. We will establish it for the following version of the steepest descent algorithm:

S0 Choose an arbitrary labeling $x^\circ \in \mathcal{D}$ and set $x := x^\circ$.

S1 Compute minimizer $x^{\mathrm{in}} \in \arg\min\{f(y) \mid y \in \mathrm{INWARD}(x)\}$. If $f(x^{\mathrm{in}}) < f(x)$ then set $x := x^{\mathrm{in}}$ and repeat step S1, otherwise go to step S2.

S2 Compute minimizer $x^{\mathrm{out}} \in \arg\min\{f(y) \mid y \in \mathrm{OUTWARD}(x)\}$. If $f(x^{\mathrm{out}}) < f(x)$ then set $x := x^{\mathrm{out}}$ and repeat step S2, otherwise terminate.

Note, one could choose $x^\circ_i$ to be the root of tree $T_i$ for each node $i \in V$, then step S1 would be redundant.
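Steps S0-S2 can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the polynomial-time procedure: the submodular and bisubmodular minimization oracles are replaced by exhaustive enumeration of INWARD(x) and OUTWARD(x), which is exponential in $n$, and the tree representation (child-to-parent dictionaries) and the name `steepest_descent` are choices made for this example.

```python
from itertools import product


def steepest_descent(f, parents, x0):
    """Run steps S0-S2 with brute-force neighborhood minimization.

    parents[i] is the child -> parent map of tree T_i (root mapped to None),
    f takes a tuple of labels, x0 is the initial labeling.
    """
    children = [
        {v: [c for c, p in par.items() if p == v] for v in par} for par in parents
    ]

    def inward(x):
        # Each label stays in place or moves one step towards its root.
        options = [
            [xi] + ([par[xi]] if par[xi] is not None else [])
            for xi, par in zip(x, parents)
        ]
        return product(*options)

    def outward(x):
        # Each label stays in place or moves to one of its children.
        options = [[xi] + ch[xi] for xi, ch in zip(x, children)]
        return product(*options)

    x = tuple(x0)                              # step S0
    for neighborhood in (inward, outward):     # step S1, then step S2
        while True:
            best = min(neighborhood(x), key=f)
            if f(best) < f(x):
                x = best
            else:
                break
    return x
```

As a usage example, take two chains with nodes 0..4 rooted at 0 and the $L^\natural$-convex function `f = lambda x: (x[0] - 2) ** 2 + (x[1] - 3) ** 2 + abs(x[0] - x[1])`; starting from `(4, 0)` the sketch ends at a labeling of cost 1, which is the global minimum over the 25 labelings.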

Theorem 4. (a) Step S1 is performed at most $K$ times. (b) Each step S2 preserves the following property:

$$f(x) = \min\{f(y) \mid y \in \mathrm{INWARD}(x)\} \tag{4}$$

(c) Step S2 is performed at most $K$ times. (d) Labeling $x$ produced upon termination of the algorithm is a minimizer of $f$.

Proof. For a labeling $x \in \mathcal{D}$ denote $\mathcal{D}^-[x] = \{y \in \mathcal{D} \mid y \preceq x\}$. We will treat domain $\mathcal{D}^-[x]$ as the collection of chains with roots $r_i$ and leaves $x_i$. Let $\rho^-(x)$ be the quantity defined in (3a). There holds

$$f(x) = \min\{f(y) \mid y \in \mathrm{INWARD}(x)\} \quad \Leftrightarrow \quad \rho^-(x) = 0 \tag{5}$$

Indeed, this equivalence can be obtained by applying proposition 2 to function $f$ restricted to $\mathcal{D}^-[x]$.

(a) When analyzing the first stage of the algorithm, we can assume without loss of generality that $\mathcal{D} = \mathcal{D}^-[x^\circ]$, i.e. each tree $T_i$ is a chain with the root $r_i$ and the leaf $x^\circ_i$. Indeed, removing the rest of the tree will not affect the behaviour of steps S1. With such assumption, function $f$ becomes $L^\natural$-convex. By theorem 3(b), steps S1 will terminate after at most $K$ steps.

(b,c) Property (4) (or equivalently $\rho^-(x) = 0$) clearly holds after termination of steps S1. Let $z$ be the labeling upon termination of steps S2. When analyzing the second stage of the algorithm, we can assume without loss of generality that $\mathcal{D} = \mathcal{D}^-[z]$, i.e. each tree $T_i$ is a chain with the root $r_i$ and the leaf $z_i$. Indeed, removing the rest of the tree will not affect the behaviour of steps S2. Furthermore, restricting $f$ to $\mathcal{D}^-[z]$ does not affect the definition of $\rho^-(x)$ for $x \in \mathcal{D}^-[z]$.

By theorem 3(a), steps S2 preserve $\rho^-(x) = 0$; this proves part (b). Part (c) follows from theorem 3(c).

(d) When steps S2 terminate, we have $f(x) = \min\{f(y) \mid y \in \mathrm{OUTWARD}(x)\}$. Combining this fact with condition (4) and using proposition 2 gives that upon algorithm's termination $x$ is a minimizer of $f$. □

3 Translation submodularity 

In this section we derive an alternative definition of strongly tree-submodular functions. As a corollary, we will obtain that strong tree-submodularity (1) implies weak tree-submodularity (2). Let us introduce another pair of operations on trees. Given labels $a, b \in D_i$ and an integer $d \ge 0$, we define

$$a \uparrow^d b = \mathcal{P}[a \to b, d] \wedge b \qquad\qquad a \downarrow_d b = \mathcal{P}[a \to b, \rho(a \uparrow^d b, b)]$$

In words, $a \uparrow^d b$ is obtained as follows: (1) move from $a$ towards $b$ by $d$ steps, stopping if $b$ is reached earlier; (2) keep moving until the current label becomes an ancestor of $b$. $a \downarrow_d b$ is the label on the path $\mathcal{P}[a \to b]$ such that the distances $\rho(a, a \downarrow_d b)$ and $\rho(a \uparrow^d b, b)$ are the same, as well as distances $\rho(a, a \uparrow^d b)$ and $\rho(a \downarrow_d b, b)$. Note, binary operations $\uparrow^d, \downarrow_d : D_i \times D_i \to D_i$ (and corresponding operations $\uparrow^d, \downarrow_d : \mathcal{D} \times \mathcal{D} \to \mathcal{D}$) are in general non-commutative. One exception is $d = 0$, in which case $\uparrow^0, \downarrow_0$ reduce to the commutative operations defined in the introduction: $x \uparrow^0 y = x \wedge y$ and $x \downarrow_0 y = x \vee y$.

For fixed labels $a, b \in D_i$ it will often be convenient to rename nodes in $\mathcal{P}[a \to b]$ to be consecutive integers so that $a \wedge b = 0$ and $a \le 0 \le b$. Then we have $a = -\rho(a, a \wedge b)$, $b = \rho(a \wedge b, b)$ and

$$a \uparrow^d b = \max\{0, \min\{a + d, b\}\} \qquad\qquad a \downarrow_d b = a + b - (a \uparrow^d b)$$
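The integer form of the two operations is easy to experiment with. The Python sketch below simply transcribes the two formulas in the renamed coordinates ($a \le 0 \le b$, with 0 standing for $a \wedge b$); the function names `up` and `down` are labels chosen for this example.

```python
def up(a: int, b: int, d: int) -> int:
    """a ↑^d b in renamed coordinates, where a <= 0 <= b and 0 is a ∧ b."""
    return max(0, min(a + d, b))


def down(a: int, b: int, d: int) -> int:
    """a ↓_d b in the same coordinates: the mirror image of a ↑^d b on the path."""
    return a + b - up(a, b, d)
```

For instance, with `a = -2` and `b = 3`, taking `d = 0` gives the pair `(0, 1)`, i.e. $a \wedge b$ and $a \vee b$, while any `d >= 5` gives `(3, -2)`, i.e. the pair $(b, a)$.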

Theorem 5. (a) If $f$ is strongly tree-submodular then for any $x, y \in \mathcal{D}$ and integer $d \ge 0$ there holds

$$f(x) + f(y) \ge f(x \uparrow^d y) + f(x \downarrow_d y) \tag{6}$$

(b) If (6) holds for any $x, y \in \mathcal{D}$ and $d \ge 0$ then $f$ is strongly tree-submodular.

Note, this result is well-known for $L^\natural$-convex functions [32, section 7.1], i.e. when all trees are chains shown in Figure 1(b); inequality (6) was then written as $f(x) + f(y) \ge f((x + d \cdot \mathbf{1}) \wedge y) + f(x \vee (y - d \cdot \mathbf{1}))$, and was called translation submodularity. In fact, translation submodularity is one of the key properties of $L^\natural$-convex functions, and was heavily used, for example, in [22] for proving theorem 3.
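Translation submodularity on chains can be checked numerically on a small example. The Python sketch below enumerates all pairs of labelings and all shifts $d$ for one hand-picked $L^\natural$-convex function (a sum of convex univariate terms and a convex function of the difference); the function `f`, the domain size and the helper name `violations` are assumptions made purely for illustration.

```python
from itertools import product


def f(x):
    # Separable convex terms plus a convex function of x[0] - x[1].
    return (x[0] - 2) ** 2 + (x[1] - 3) ** 2 + abs(x[0] - x[1])


def violations(f, K, n):
    """List all (x, y, d) on {0..K}^n that violate translation submodularity."""
    bad = []
    for x in product(range(K + 1), repeat=n):
        for y in product(range(K + 1), repeat=n):
            for d in range(K + 1):
                lo = tuple(min(xi + d, yi) for xi, yi in zip(x, y))  # (x + d*1) ∧ y
                hi = tuple(max(xi, yi - d) for xi, yi in zip(x, y))  # x ∨ (y - d*1)
                if f(x) + f(y) < f(lo) + f(hi):
                    bad.append((x, y, d))
    return bad
```

Calling `violations(f, 4, 2)` returns an empty list for this function; with `d = 0` the check reduces to ordinary submodularity with respect to (min, max).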

Setting $d = 0$ in theorem 5(a) gives

Corollary 6. A strongly tree-submodular function $f$ is also weakly tree-submodular, i.e. (1) implies (2).

A proof of parts (b) and (a) of theorem 5 is given in sections 3.1 and 3.2 respectively. In both proofs we always implicitly assume that for each $i \in V$ labels in $\mathcal{P}[x_i \to y_i]$ are renamed to be consecutive integers with $x_i \wedge y_i = 0$ and $x_i \le 0 \le y_i$.



3.1 Proof of theorem 5(b)

We prove inequality (1) for $x, y \in \mathcal{D}$ using induction on $\rho_1(x, y) = \sum_{i \in V} \rho(x_i, y_i)$. The base case $\rho_1(x, y) = 0$, or $x = y$, is trivial; suppose that $\rho_1(x, y) \ge 1$. Denote $d_{\max} = \rho(x, y) \ge 1$ and $d = \lfloor d_{\max}/2 \rfloor \ge 0$. Two cases are possible.

Case 1  $d_{\max}$ is even. We can assume without loss of generality that there exists $k \in V$ such that $y_k - x_k = d_{\max}$ and $|x_k| \ge y_k$. (If there is no such $k$, we can simply swap $x$ and $y$; inequality (1) will be unaffected since operations $\sqcap, \sqcup$ are commutative, and $\rho(x, y)$, $\rho_1(x, y)$ will not change.) Consider labelings $x', y' \in \mathcal{D}$ defined as follows:

$$y'_i = \begin{cases} y_i - 1 & \text{if } y_i - x_i = d_{\max},\ |x_i| \ge y_i \\ y_i & \text{otherwise} \end{cases} \qquad\qquad x'_i = x_i \sqcup y'_i$$

for each $i \in V$. We claim that

(a) $x \sqcap y' = x \sqcap y$    (b) $x \sqcup y' = x'$
(c) $x' \uparrow^d y = y'$    (d) $x' \downarrow_d y = x \sqcup y$

Indeed, for each node $i \in V$ one of the following holds:

• $y_i - x_i \le d_{\max} - 1$. Then $y'_i = y_i$, $x'_i = x_i \sqcup y_i$, so (a) and (b) hold for node $i$. We also have $y_i - x'_i = y_i - (x_i \sqcup y_i) \le \lceil (y_i - x_i)/2 \rceil \le \lceil (d_{\max} - 1)/2 \rceil \le d$, which implies (c) and (d).

• $y_i - x_i = d_{\max}$ and $|x_i| < y_i$. Then $y'_i = y_i$, $x'_i = x_i \sqcup y_i = (x_i + y_i)/2$, $y_i - x'_i = d$; as above, this implies (a)-(d).

• $y_i - x_i = d_{\max}$ and $|x_i| \ge y_i$. Then $y'_i = y_i - 1$, $x'_i = x_i \sqcup y'_i = \lfloor (x_i + y_i - 1)/2 \rfloor = (x_i + y_i)/2 - 1 = y'_i - d$. Checking that (a)-(d) hold is straightforward.

We have $y'_k = y_k - 1$, and so $\rho_1(x, y') < \rho_1(x, y)$. Therefore,

$$f(x) + f(y') \ge f(x \sqcap y') + f(x \sqcup y') \qquad\qquad f(x') + f(y) \ge f(x' \uparrow^d y) + f(x' \downarrow_d y)$$

where the first inequality follows from the induction hypothesis and the second one follows from (6). Summing these inequalities and subtracting $f(x') + f(y')$ from both sides using (a)-(d) gives (1).

Case 2  $d_{\max}$ is odd. By swapping $x$ and $y$, if necessary, we can assume without loss of generality that there exists $k \in V$ such that $y_k - x_k = d_{\max}$ and $|x_k| < y_k$. (Note, we cannot have $y_i - x_i = d_{\max}$ and $|x_i| = y_i$ since $d_{\max}$ is odd.) Consider labelings $x', y' \in \mathcal{D}$ defined as follows:

$$x'_i = \begin{cases} x_i + 1 & \text{if } y_i - x_i = d_{\max},\ |x_i| < y_i \\ x_i & \text{otherwise} \end{cases} \qquad\qquad y'_i = x'_i \sqcap y_i$$

for each $i \in V$. We claim that

(a) $x' \sqcap y = y'$    (b) $x' \sqcup y = x \sqcup y$
(c) $x \uparrow^d y' = x \sqcap y$    (d) $x \downarrow_d y' = x'$

Indeed, for each node $i \in V$ one of the following holds:

• $y_i - x_i \le d_{\max} - 1$. Then $x'_i = x_i$, $y'_i = x_i \sqcap y_i$, so (a) and (b) hold for node $i$. We also have $y'_i - x_i = (x_i \sqcap y_i) - x_i \le \lceil (y_i - x_i)/2 \rceil \le \lceil (d_{\max} - 1)/2 \rceil \le d$, which implies (c) and (d).

• $y_i - x_i = d_{\max}$ and $|x_i| > y_i$. Then $x'_i = x_i$, $y'_i = x_i \sqcap y_i = \lceil (x_i + y_i)/2 \rceil \le 0$, so (a) and (b) hold for node $i$. (c) and (d) hold since $y'_i \le 0$.

• $y_i - x_i = d_{\max}$ and $|x_i| < y_i$. Then $x'_i = x_i + 1$, $y'_i = x'_i \sqcap y_i = \lfloor (x_i + y_i + 1)/2 \rfloor$. Checking that (a)-(d) hold is straightforward.

We have $x'_k = x_k + 1$, and so $\rho_1(x', y) < \rho_1(x, y)$. Therefore,

$$f(x') + f(y) \ge f(x' \sqcap y) + f(x' \sqcup y) \qquad\qquad f(x) + f(y') \ge f(x \uparrow^d y') + f(x \downarrow_d y')$$

where the first inequality follows from the induction hypothesis and the second one follows from (6). Summing these inequalities and subtracting $f(x') + f(y')$ from both sides using (a)-(d) gives (1). □

3.2 Proof of theorem 5(a)

We say that the triplet $(x, y, d)$ is valid if $x, y \in \mathcal{D}$ and $d \in [0, \rho(x, y)]$. We denote $z = x \uparrow^d y$; we have $x_i \le 0 \le z_i \le y_i$. Let us introduce a partial order $\preceq$ over valid triplets as the lexicographical order with variables $(y_1 - x_1, \ldots, y_n - x_n, -d)$. Note, the last component $-d$ is the least significant. We use induction on this partial order. The induction base is trivial: if the first $n$ components are zeros then $x = y$ so (6) is an equality, and if the last component is minimal (i.e. $d = \rho(x, y)$) then $x \uparrow^d y = y$ and $x \downarrow_d y = x$, so (6) is again an equality. Suppose that $x \ne y$ and $d \le \rho(x, y) - 1$.

Consider integer $d' \ge d$, and denote $y' = x \uparrow^{d'+1} y$ and $\delta_i = y'_i - z_i \ge 0$ for $i \in V$. Suppose that $\delta_i \in \{0, 1\}$ for all nodes $i \in V$. (This holds, for example, if $d' = d$.) Denote $x' = x \downarrow_d y'$; then $x'_i = x_i + y'_i - (x_i \uparrow^d y'_i) = x_i + y'_i - z_i = x_i + \delta_i$. We claim that

$$\begin{array}{ll} \text{(a)}\ \ x \uparrow^d y' = x \uparrow^d y & \text{(b)}\ \ x \downarrow_d y' = x' \\ \text{(c)}\ \ x' \uparrow^{d'} y = y' & \text{(d)}\ \ x' \downarrow_{d'} y = x \downarrow_d y \end{array} \tag{7}$$

In order to prove it, let us consider node $i$. Property (a) follows from the fact that $y'_i \ge x_i \uparrow^d y_i$. Property (b) is the definition of $x'$. To prove (c), consider two possible cases:

• $\delta_i = 0$, so $x'_i = x_i$ and $y'_i = x_i \uparrow^{d'+1} y_i = x_i \uparrow^d y_i$. The latter condition and the fact $d' + 1 > d$ imply that $x_i + d \ge y_i$, therefore $x'_i + d' \ge y_i = y'_i$. This leads to (c).

• $\delta_i = 1$. If $y'_i = y_i$ then condition (c) is straightforward (it follows from $x_i \uparrow^{d'+1} y_i = y'_i$). Suppose that $y'_i < y_i$, then from definition of $y'_i$ we have $x_i + d' + 1 \le y'_i$, or $x'_i + d' \le y'_i$. This leads to (c).

Finally, properties (c) and (d) are equivalent since

$$x'_i + y_i - y'_i - [x_i \downarrow_d y_i] = [x_i + y'_i - (x_i \uparrow^d y'_i)] + y_i - y'_i - [x_i + y_i - (x_i \uparrow^d y_i)] = (x_i \uparrow^d y_i) - (x_i \uparrow^d y'_i) = 0$$

Now suppose that in addition to conditions $\delta_i \in \{0, 1\}$ there holds $x' \ne x$ and $y' \ne y$. Then we have $(x, y', d) \prec (x, y, d)$ and $(x', y, d') \prec (x, y, d)$, so by the induction hypothesis

$$f(x) + f(y') \ge f(x \uparrow^d y') + f(x \downarrow_d y') \qquad\qquad f(x') + f(y) \ge f(x' \uparrow^{d'} y) + f(x' \downarrow_{d'} y)$$

Summing these inequalities and subtracting $f(x') + f(y')$ from both sides using (a)-(d) gives (6).
Let us describe cases when the argument above can be applied; such cases can be eliminated from consideration. First, suppose that $y_j - z_j \ge 2$ for some node $j \in V$, then there exists $d' \ge d$ such that the labeling $y' = x \uparrow^{d'+1} y$ has at least one node $j \in V$ with $y'_j \in [z_j + 1, y_j - 1]$. Let us choose the minimum integer $d'$ that has this property. Then $\delta_i \in \{0, 1\}$ for all nodes $i \in V$, since $\delta_i \ge 2$ would contradict the minimality of chosen $d'$. We also have $y'_j \ne y_j$ and $x'_j \ne x_j$ (since $x'_j - x_j = \delta_j = 1$), so the conditions above are satisfied. Therefore, from now on we assume without loss of generality that $y_i - z_i \in \{0, 1\}$ for all nodes $i \in V$.

We can also take $d' = d$. Condition $\delta_i \in \{0, 1\}$ is then satisfied for all nodes. Therefore, we can assume without loss of generality that either $x' = x$ or $y' = y$ where $y' = x \uparrow^{d+1} y$, $x' = x \downarrow_d y'$, otherwise the induction argument above could be applied. Suppose that $x' = x$. This is equivalent to $x \uparrow^d y' = y'$, or to the following condition for all nodes $i \in V$: either $x_i + d < 0$ or $y_i - x_i \le d$. It can be checked that $x \uparrow^{d+1} y = x \uparrow^d y$ and $x \downarrow_{d+1} y = x \downarrow_d y$. Furthermore, $(x, y, d + 1) \prec (x, y, d)$, so (6) follows by the induction hypothesis. We thus assume from now on that $y' = y$.

Equations below summarize definitions and assumptions made so far: 

(8a) 
(8b) 
(8c) 
(8d) 

Let $S$ be the set of nodes $i \in V$ with $\delta_i = 1$. It is straightforward to check that

$$i \in S \ \Rightarrow\ x_i + d = z_i = y_i - 1 \tag{8e}$$

$$i \in V - S \ \Rightarrow\ x_i \uparrow^d y_i = y_i \ \text{ and } \ x_i \downarrow_d y_i = x_i = x'_i \tag{8f}$$

If $S$ is empty then $x \uparrow^d y = y$, $x \downarrow_d y = x$, so inequality (6) is trivial. Thus, we can assume that $S$ is non-empty. Suppose that $S$ contains two distinct nodes $i$ and $j$. Let us modify labelings $x'$ and $y'$ as follows: for node $j$ set $x'_j = x_j$, $y'_j = z_j$. It is straightforward to check that conditions (7) for $d' = d$ still hold. Furthermore, $x'_i > x_i$, $y'_j < y_j$, so $(x, y', d) \prec (x, y, d)$ and $(x', y, d) \prec (x, y, d)$. Applying the argument described above gives (6).

We are left with the case when $S$ contains a single node $j$. We will consider 5 possible subcases. In 4 of them, we will do the following: (i) specify new labelings $x'$ and $y'$ with $x'_i, y'_i \in [x_i, y_i]$ for each node $i$; (ii) specify four identities involving $x, y, x', y'$ such that the right-hand sides contain expressions $x'$, $y'$, $x \uparrow^d y$, $x \downarrow_d y$, and the left-hand sides contain expressions of the form $x \circ_1 y'$, $x \,\bar\circ_1\, y'$, $x' \circ_2 y$, $x' \,\bar\circ_2\, y$ where $\circ_k$ is one of the operations $\sqcap, \sqcup, \uparrow^{\cdot}, \downarrow_{\cdot}$ and $\bar\circ_k$ is the corresponding "symmetric" operation. This will describe how to prove (6): we would need to sum two inequalities

$$f(x) + f(y') \ge f(x \circ_1 y') + f(x \,\bar\circ_1\, y') \qquad\qquad f(x') + f(y) \ge f(x' \circ_2 y) + f(x' \,\bar\circ_2\, y)$$

that hold either by strong tree-submodularity or by the induction hypothesis, then use provided identities to prove (6). Checking the identities and the applicability of the induction hypothesis in the case of operations $\uparrow^{\cdot}, \downarrow_{\cdot}$ is mechanical, and we omit it.
Case 1  $z_j = x_j + d \ge 1$ (implying $d \ge 1$). The identities are

$$\begin{array}{ll} \text{(a)}\ \ x \uparrow^{d-1} y' = x' & \text{(b)}\ \ x \downarrow_{d-1} y' = x \downarrow_d y \\ \text{(c)}\ \ x' \sqcup y = x \uparrow^d y & \text{(d)}\ \ x' \sqcap y = y' \end{array} \tag{9}$$

and labelings $x', y'$ are defined as follows:

• if $i = j$ set $x'_j = y_j - 2$, $y'_j = y_j - 1$;

• otherwise if $x_i + d = y_i > 0$ set $x'_i = y'_i = y_i - 1$;

• otherwise (if $y_i = 0$ or $x_i + d > y_i$) set $x'_i = y'_i = y_i$.

The remainder is devoted to the case $z_j = x_j + d = 0$. Note that we must have $y_j = 1$.
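Case 2  $d \ge 1$, $z_j = x_j + d = 0$ and there exists node $k \in V - \{j\}$ with $x_k = 0$, $y_k > 0$. Then

$$\begin{array}{ll} \text{(a)}\ \ x' \uparrow^d y = x \uparrow^d y & \text{(b)}\ \ x' \downarrow_d y = y' \\ \text{(c)}\ \ x \sqcup y' = x' & \text{(d)}\ \ x \sqcap y' = x \downarrow_d y \end{array} \tag{10}$$

$x'$, $y'$ are defined as follows:

• if $i = j$ set $x'_j = x_j$, $y'_j = x_j + 1$;

• otherwise if $i = k$ set $x'_k = y'_k = x_k + 1 = 1$;

• otherwise set $x'_i = y'_i = x_i$.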

Case 3  $d \ge 1$, $z_j = x_j + d = 0$ and there is no node $k \in V - \{j\}$ with $x_k = 0$, $y_k > 0$. The identities are

$$\begin{array}{ll} \text{(a)}\ \ x' \uparrow^{d-1} y = x \uparrow^d y & \text{(b)}\ \ x' \downarrow_{d-1} y = y' \\ \text{(c)}\ \ x \sqcup y' = x \downarrow_d y & \text{(d)}\ \ x \sqcap y' = x' \end{array} \tag{11}$$

$x'$, $y'$ are defined as follows:

• if $i = j$ set $x'_j = x_j + 1$, $y'_j = x_j + 2$;

• otherwise if $x_i < 0$ set $x'_i = y'_i = x_i + 1$;

• otherwise (if $x_i = y_i = 0$) set $x'_i = y'_i = 0$.

Note, to verify identities (11) for node $j$, one should consider cases $d = 1$ and $d \ge 2$ separately.



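Case 4  $d = 0$ (implying $x_j = 0$, $y_j = 1$) and there exists node $k \in V - \{j\}$ with $x_k < 0$. Then

$$\begin{array}{ll} \text{(a)}\ \ x \uparrow^0 y' = x' & \text{(b)}\ \ x \downarrow_0 y' = x \downarrow_0 y \\ \text{(c)}\ \ x' \sqcup y = y' & \text{(d)}\ \ x' \sqcap y = x \uparrow^0 y \end{array} \tag{12}$$

$x'$, $y'$ are defined as follows:

• if $i = j$ set $x'_j = 0$, $y'_j = 1$;

• otherwise if $x_i < 0$, $y_i = 0$ set $x'_i = y'_i = -1$;

• otherwise (if $x_i = y_i = 0$) set $x'_i = y'_i = 0$.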

Case 5  $d = 0$ (implying $x_j = 0$, $y_j = 1$) and there is no node $k \in V - \{j\}$ with $x_k < 0$. Thus, $x_i = y_i = 0$ for all $i \in V - \{j\}$. There holds $x \uparrow^0 y = x$, $x \downarrow_0 y = y$, so inequality (6) is trivial. □

4 Weakly tree-submodular functions

In this section we consider functions $f$ that satisfy condition (2), but not necessarily condition (1). It is well-known [39, 32] that such functions can be minimized efficiently if all trees $T_i$ are chains rooted at an endpoint and $\max_i |D_i|$ is polynomially bounded. The algorithm utilizes Birkhoff's representation theorem [2] which says that there exists a ring family $\mathcal{R}$ such that there is an isomorphism between sets $\mathcal{D}$ and $\mathcal{R}$ that preserves operations $\wedge$ and $\vee$. (A subset $\mathcal{R} \subseteq \{0, 1\}^m$ is a ring family if it is closed under operations $\wedge$ and $\vee$.) It is known that submodular functions over a ring family can be minimized in polynomial time, which implies the result. Note that the number of variables will be $m = O(\sum_i |D_i|)$.

Another case when $f$ satisfying (2) can be minimized efficiently is when $f$ is bisubmodular, i.e. all trees are as shown in Figure 1(c). Indeed, in this case the pairs of operations $(\sqcap, \sqcup)$ and $(\wedge, \vee)$ coincide.

An interesting question is whether there exist other classes of weakly tree-submodular functions that can be minimized efficiently. In this section we provide one rather special example. We consider the tree shown in Figure 1(d). Each $T_i$ has nodes $\{0, 1, \ldots, K, K_{-1}, K_{+1}\}$ such that 0 is the root, the parent of $k$ for $k = 1, \ldots, K$ is $k - 1$, and the parent of $K_{-1}$ and $K_{+1}$ is $K$.

In order to minimize function $f$ for such choice of trees, we create $K + 1$ variables $y_{i0}, y_{i1}, \ldots, y_{iK}$ for each original variable $x_i \in D_i$. The domains of these variables are as follows: $\tilde{D}_{i0} = \ldots = \tilde{D}_{i\,K-1} = \{0, 1\}$, $\tilde{D}_{iK} = \{-1, 0, +1\}$. Each domain is treated as a tree with root 0 and other nodes being the children of 0; this defines operations $\wedge$ and $\vee$ for domains $\tilde{D}_{i0}, \ldots, \tilde{D}_{i\,K-1}, \tilde{D}_{iK}$. The domain $\tilde{\mathcal{D}}$ is set as the Cartesian product of individual domains over all nodes $i \in V$. Note, a vector $y \in \tilde{\mathcal{D}}$ has $n(K + 1)$ components.

For a labeling $x \in \mathcal{D}$ let us define labeling $y = \psi(x) \in \tilde{\mathcal{D}}$ as follows:

$$\begin{array}{lcl} x_i = k \in \{0, 1, \ldots, K\} & \Rightarrow & y_{i0} = \ldots = y_{i\,k-1} = 1,\ \ y_{ik} = \ldots = y_{iK} = 0 \\ x_i = K_{-1} & \Rightarrow & y_{i0} = \ldots = y_{i\,K-1} = 1,\ \ y_{iK} = -1 \\ x_i = K_{+1} & \Rightarrow & y_{i0} = \ldots = y_{i\,K-1} = 1,\ \ y_{iK} = +1 \end{array}$$

It is easy to check that mapping $\psi : \mathcal{D} \to \tilde{\mathcal{D}}$ is injective and preserves operations $\wedge$ and $\vee$. Therefore, $\mathcal{R} = \mathrm{Im}\,\psi$ is a signed ring family, i.e. a subset of $\tilde{\mathcal{D}}$ closed under operations $\wedge$ and $\vee$. It is known [29] that bisubmodular functions over signed ring families can be minimized in polynomial time, leading to

Proposition 7. Functions that are weakly tree-submodular with respect to trees shown in Figure 1(d) can be minimized in time polynomial in $n$ and $\max_i |D_i|$.
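The claim that $\psi$ is injective and preserves $\wedge$ and $\vee$ can be checked mechanically for small $K$. The Python sketch below does this for a single variable; the encoding of the two extra leaves as the strings `"K-"` and `"K+"`, the child-to-parent dictionary, and all function names are assumptions made for this illustration.

```python
def make_tree(K):
    """Tree of Figure 1(d): chain 0..K rooted at 0, plus two leaves below K."""
    parent = {0: None}
    for k in range(1, K + 1):
        parent[k] = k - 1
    parent["K-"] = K
    parent["K+"] = K
    return parent


def meet_join(parent, a, b):
    """Return (a ∧ b, a ∨ b) on a rooted tree given as a child -> parent map."""
    def ancestors(v):
        chain = [v]
        while parent[v] is not None:
            v = parent[v]
            chain.append(v)
        return chain

    up_a, up_b = ancestors(a), ancestors(b)
    in_a = set(up_a)
    meet = next(v for v in up_b if v in in_a)
    left = up_a[:up_a.index(meet) + 1]
    right = up_b[:up_b.index(meet)][::-1]
    return meet, (left + right)[len(right)]


def psi(x, K):
    """Encode one label as the vector (y_0, ..., y_K)."""
    if x == "K-":
        return (1,) * K + (-1,)
    if x == "K+":
        return (1,) * K + (1,)
    return (1,) * x + (0,) * (K + 1 - x)


def meet_join_encoded(u, v):
    """Component-wise ∧ and ∨ on the star-shaped domains {0,1} and {-1,0,+1}."""
    sign = lambda t: (t > 0) - (t < 0)
    meet = tuple(a if a == b else 0 for a, b in zip(u, v))
    join = tuple(sign(a + b) for a, b in zip(u, v))
    return meet, join


def psi_preserves_ops(K):
    parent = make_tree(K)
    labels = list(parent)
    injective = len({psi(x, K) for x in labels}) == len(labels)
    preserved = all(
        (psi(m, K), psi(j, K)) == meet_join_encoded(psi(a, K), psi(b, K))
        for a in labels
        for b in labels
        for m, j in [meet_join(parent, a, b)]
    )
    return injective and preserved
```

Running `psi_preserves_ops(K)` for a few small values of `K` returns `True` in each case, which is consistent with the statement above; it is a sanity check on small instances, not a proof.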

5 Conclusions and discussion 

We introduced two classes of functions (strongly tree-submodular and weakly tree-submodular) 
that generalize several previously studied classes. For each class, we gave new examples of trees 
for which the minimization problem is tractable. 

Our work leaves a natural open question: what is the complexity of the problem for more 
general trees? In particular, can we minimize efficiently strongly tree-submodular functions if 
trees are non-binary, i.e. if some nodes have three or more children? Note that the algorithm in 
section 2 and its analysis are still valid, but it is not clear whether the minimization procedure in step S2 can be implemented efficiently. Also, are there trees besides the one shown in Figure 1(d) for which weakly tree-submodular functions can be minimized efficiently?

More generally, can one characterize for which operations $(\sqcap, \sqcup)$ the minimization problem is tractable? Currently known tractable examples are distributive lattices, some non-distributive lattices [27, 28], operations on trees introduced in this paper, and combinations of the above operations obtained via Cartesian product and Mal'tsev product [27]. Are there tractable cases that cannot be obtained via lattice and tree-based operations?